Comparing sets of patterns with the Jaccard index
Similar resources
HyperMinHash: Jaccard index sketching in LogLog space
In this extended abstract, we describe and analyse a streaming probabilistic sketch, HYPERMINHASH, to estimate the Jaccard index (or Jaccard similarity coefficient) over two sets A and B. HyperMinHash can be thought of as a compression of standard log n-space MinHash by building off of a HyperLogLog count-distinct sketch. For a multiplicative approximation error 1+ε on a Jaccard index t, given a ...
Estimating Jaccard Index with Missing Observations: A Matrix Calibration Approach
The Jaccard index is a standard statistic for comparing the pairwise similarity between data samples. This paper investigates the problem of estimating a Jaccard index matrix when there are missing observations in data samples. Starting from a Jaccard index matrix approximated from the incomplete data, our method calibrates the matrix to meet the requirement of positive semi-definiteness and o...
Evaluating the Jaccard-Tanimoto Index on Multi-core Architectures
The Jaccard/Tanimoto coefficient is an important workload, used in a large variety of problems including drug design fingerprinting, clustering analysis, similarity web searching and image segmentation. This paper evaluates the Jaccard coefficient on the Cell/B.E. processor and the Intel® Xeon® dual-core platform. In our work, we have developed a novel parallel algorithm specially suited...
Jaccard-Spline index of structural proximity in contact networks
Network analysts are increasingly being called upon to apply their expertise to groups for which the only available or reliable data is a contact network. With no opportunity to gather additional data, the merits of such applications depend on empirical studies that validate the employment of structural constructs based on contact networks. Fortunately, we possess such studies in abundance. One of the ...
Optimization of the Jaccard index for image segmentation with the Lovász hinge
The Jaccard loss, commonly referred to as the intersection-over-union loss, is commonly employed in the evaluation of segmentation quality due to its better perceptual quality and scale invariance, which lends appropriate relevance to small objects compared with per-pixel losses. We present a method for direct optimization of the per-image intersection-over-union loss in neural networks, in the...
Journal
Journal title: Australasian Journal of Information Systems
Year: 2018
ISSN: 1449-8618
DOI: 10.3127/ajis.v22i0.1538